ni and Hastie, 2007]. The outliers are defined by $\{y_i \in \mathbf{y} : y_i > q_{75}(\mathbf{x}, \mathbf{y}) + \mathrm{IQR}(\mathbf{x}, \mathbf{y})\}$.
The OS t statistic is defined as below, where
number of the case expressions. The OS p value is also calculated
permutation approach.
\[
t_{OS} = \frac{\sum_{i=1} (y_i - \lambda)}{\sigma} \tag{6.11}
\]
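The OS computation can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that $\lambda$ is the median and $\sigma$ the scaled median absolute deviation of the pooled expressions, neither of which is defined in the text here, and it assumes a simple label-shuffling scheme for the permutation p value. The function names `os_statistic` and `os_permutation_p` are hypothetical.

```python
import numpy as np


def os_statistic(x, y):
    """Sketch of the OS t statistic for control x and case y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])

    # Outlier threshold from the pooled data: q75(x, y) + IQR(x, y).
    q25, q75 = np.percentile(pooled, [25, 75])
    threshold = q75 + (q75 - q25)

    # ASSUMPTION: lambda is the pooled median, sigma the scaled pooled MAD.
    lam = np.median(pooled)
    sigma = 1.4826 * np.median(np.abs(pooled - lam))  # zero if data are constant

    # Sum the centred case expressions that exceed the threshold.
    outliers = y[y > threshold]
    return float(np.sum(outliers - lam) / sigma)


def os_permutation_p(x, y, n_perm=1000, seed=0):
    """ASSUMPTION: p value by shuffling the control/case labels."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = os_statistic(x, y)
    pooled = np.concatenate([x, y])

    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if os_statistic(perm[:x.size], perm[x.size:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because the threshold, the centre and the scale are all computed from the pooled data, they stay fixed across permutations; only the assignment of expressions to the case group changes.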
ORT
erator was further revised in the outlier robust t statistic algorithm
n ORT, the range of outlier discovery was enlarged using the
rter percentile [Wu, 2007]. OS and ORT employ a similar
strategy. OS makes a division between the outlier and non-outlier
ns while ORT divides the control expressions using the pooled
ns. ORT uses the same statistic but defines the outliers in relation
e expressions only:
\[
\{y_i \in \mathbf{y} : y_i > q_{75}(\mathbf{x}) + \mathrm{IQR}(\mathbf{x})\} \tag{6.12}
\]
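The outlier rule in (6.12) can be illustrated with a short Python sketch. The function name `ort_outliers` is hypothetical, and the percentiles use NumPy's default linear interpolation, which may differ from the convention used in the original work.

```python
import numpy as np


def ort_outliers(x, y):
    """Case expressions in y above q75(x) + IQR(x), with x the control group."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # The threshold is computed from the x expressions only, as in (6.12).
    q25, q75 = np.percentile(x, [25, 75])
    threshold = q75 + (q75 - q25)

    return y[y > threshold]
```

For example, with `x = [1, 2, 3, 4, 5]` the threshold is 4 + 2 = 6, so only case expressions larger than 6 are treated as outliers.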
MOST
imum ordered subset t statistic algorithm (MOST) employs a
method [Lian, 2008]. For both the control expressions and the case
ns, two median values are calculated. They are $\mu_x = \mathrm{median}(\mathbf{x})$
and $\mu_y = \mathrm{median}(\mathbf{y})$. Based on these two median values, the median of
ences between expressions and expression medians is calculated
following equation,
\[
\omega = 1.4826 \times \mathrm{median}\{|\mathbf{x} - \mu_x|, |\mathbf{y} - \mu_y|\} \tag{6.13}
\]
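As a worked illustration of (6.13), the scale $\omega$ could be computed as in the sketch below. The function name `most_scale` is hypothetical. The sketch reads the braces in (6.13) as pooling the absolute deviations of both groups before taking a single median.

```python
import numpy as np


def most_scale(x, y):
    """Sketch of omega in (6.13): scaled median of pooled absolute deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Each group is centred on its own median.
    mu_x = np.median(x)
    mu_y = np.median(y)

    # Pool the absolute deviations of both groups, then take one median.
    deviations = np.concatenate([np.abs(x - mu_x), np.abs(y - mu_y)])
    return float(1.4826 * np.median(deviations))
```

The constant 1.4826 is the usual factor that makes a median absolute deviation consistent with the standard deviation for Gaussian data.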
andard Gaussian distribution is generated as a benchmark
on, which has a zero mean and a unit standard deviation. The
such a standard Gaussian distribution is denoted by $\vartheta$. Suppose k